Fault management system for a communications network

ABSTRACT

There is described a method of operating a fault management system for an access network which forms part of a communications network. In the access network, terminating lines in the form of pairs of wires extend from a local switch (10) through a series of nodes to terminal equipment provided for users of the network. Each night, the system performs a series of tests on each of the lines. The results of the tests are then analysed with respect to a set of parameters to identify characteristics that would indicate that a fault is likely to occur on the associated circuit within a predetermined period, e.g. 1 year. Further analysis is then carried out to establish the location of the faults in the network by measuring the degree of clustering of faults around network nodes.

[0001] This invention relates to a fault management system for managing faults in the terminating circuits of a communications network and also to a method of operating such a fault management system.

[0002] A conventional communications network comprises a relatively small number of interconnected main switches and a much larger number of local switches, each of which is connected to one or two of the main switches. The local switches are connected to the terminating circuits of the network and the far ends of these circuits are connected to terminal equipment such as telephone instruments provided for users of the network. The network formed from the main switches and local switches is known as the core network while a network formed from the terminating circuits is known variously as an access network or a local loop. In this specification, it will be referred to as an access network. Some terminating circuits are connected to a remote concentrator, which may or may not have switching capability. The remote concentrator is then connected to a local switch. In this specification, the term “local switch” is to be interpreted to cover both local switches and remote concentrators.

[0003] In a conventional access network, each terminating circuit is formed from a pair of copper wires. Typically, each pair of copper wires passes through a series of nodes (or network elements) between the local switch and terminal equipment. Examples of such nodes are primary cross-connect points, secondary cross-connect points, distribution points (DPs), cable nodes and joints.

[0004] Recently, optical fibres have been used to carry terminating circuits in access networks. In a modern access network, both pairs of copper wires and optical fibres are used to carry the terminating circuits. Where a terminating circuit is carried by an optical fibre, the circuit will typically pass through several nodes between the local switch and the terminal equipment. At each node, the incoming fibre from the local switch is split into a group of outgoing fibres which branch out in various directions. Where a terminating circuit is carried by an optical fibre from the local switch, the last part of the circuit may be carried by a pair of copper wires. Unfortunately, terminating circuits are prone to faults. In the case of a terminating circuit carried by a pair of copper wires, examples of such faults are disconnection, a short circuit between the two wires of a pair and a short circuit between one of the wires and earth. In the case of a conventional access network formed from pairs of wires, the causes of the faults include ingress of water into a node and also physical damage to a node.

[0005] When a customer reports a fault, the terminating circuit may be tested so as to identify the cause of the fault. The fault can then be repaired. However, until the fault is repaired, the user suffers a loss of service. It is known how to perform a set of circuit tests on each terminating circuit in an access network on a routine basis, for example nightly. Such routine tests can detect a fault on a terminating circuit. The fault can then be repaired, possibly before the user of the terminating circuit notices a loss of service. It is also known to measure the operational quality of individual nodes of an access network. Where the operational quality of a node is poor, it is likely that faults will develop in terminating circuits passing through the node. However, lines run through a number of nodes before terminating. As a result, locating the node from which potential faults emanate is difficult, and this in turn makes efficient preventive maintenance difficult.

[0006] According to one embodiment of the present invention there is provided a method of operating a fault management system for a communications network comprising a plurality of lines passing through a plurality of nodes, said method comprising the steps of:

[0007] performing a test on a plurality of said lines to obtain one or more elements of test data for each line;

[0008] analysing the test data to identify lines with common fault characteristics; and establishing a score for each node based on a relative measure of the physical clustering of lines with common fault characteristics for each node so as to give an indication of the node at which the cause of the common fault characteristic is most likely to be located.

[0009] The cluster score gives a relative measure for each node that indicates the node where the potential faults are grouped together and so most likely to have the same cause. The cause of all the potential faults can then be economically rectified, perhaps before any of the potential faults become actual (or hard) faults and thus detectable by a customer.

[0010] This invention will now be described in more detail, by way of example, with reference to the accompanying drawings in which:

[0011] FIG. 1 is a block diagram of an access network and an associated local switch which form part of a communications network in which the present invention may be used;

[0012] FIG. 2 is a block diagram showing the components of the communications network which are used to provide a fault management system embodying the invention for the access network of FIG. 1;

[0013] FIG. 3 is a circuit diagram illustrating some of the measurements which are made when testing a terminating circuit;

[0014] FIG. 4 is a flow diagram illustrating the processing performed in the fault management system in identifying faults in the network;

[0015] FIG. 5 is a table of example test data used in an example of the process illustrated in FIG. 4;

[0016] FIGS. 6 and 7 are schematic illustrations of a communications network showing a plurality of network nodes interconnected by communications lines.

[0017] Referring now to FIG. 1, there is shown a local switch 10 and a conventional access network 12 connected to the local switch 10. The local switch 10 and the access network 12 form part of a communications network. The local switch 10 is connected to the terminating circuits or lines of the access network 12. Typically, a local switch is connected to several thousand terminating circuits. Each terminating circuit or line passes through several nodes before reaching its respective terminal equipment. These nodes comprise primary cross-connect points, secondary cross-connect points, distribution points (DPs) and junctions, and examples of these nodes will be described below.

[0018] In the conventional access network 12 shown in FIG. 1, each terminating circuit or line is formed from a pair of copper wires. The copper wires leave the local switch 10 in the form of one or more cables. One of these cables is shown in FIG. 1 and indicated by reference numeral 14. The far end of cable 14 from switch 10 is connected to a primary cross-connect point 16 which may be housed in a street cabinet or underground junction box. From the primary cross-connect point 16, the terminating lines branch out as cables in several directions. For simplicity, in FIG. 1 there are shown only three cables 18, 20 and 22. The far end of cable 18 is connected to a joint 19. The joint 19 is connected by cable 21 to a secondary cross-connect point 24. The far ends of cables 20 and 22 are connected, respectively, to secondary cross-connect points 26 and 28. For reasons of simplicity, the continuations of the terminating lines beyond secondary cross-connect points 24 and 26 are not shown. The secondary cross-connect points 24, 26 and 28 are housed in junction boxes which may be located above or below ground.

[0019] From the secondary cross-connect point 28, the terminating lines branch out again in several directions in the form of cables. By way of illustration, FIG. 1 shows cables 40, 42, and 44 leaving the secondary cross-connect point 28. Cables 40 and 44 are connected, respectively, to joints 46 and 48. Joints 46 and 48 are connected, respectively, to cables 50 and 52, the far ends of which are connected to distribution points 54 and 56. The far end of cable 42 is connected to a joint 60. The joint 60 is connected by cable 62 to a distribution point 64. For reasons of simplicity, the terminating lines beyond distribution points 54 and 56 are not shown.

[0020] Distribution points are implemented as junction boxes which are typically located on telephone poles. From each distribution point, the terminating lines branch out as single copper wire pairs to where terminal equipment provided for a user of the network is located. By way of illustration, FIG. 1 shows two single copper wire pairs 70, 72 leaving the distribution point 64. The far ends of copper wire pairs 70 and 72 are connected, respectively, to terminal equipment 74, 76. As is well known, terminal equipment may take various forms. For example, terminal equipment may be a telephone located in a telephone box, a telephone instrument located in a domestic house or an office, or a fax machine or a computer located in a customer's premises. In the example shown in FIG. 1, each of the joints 19, 46, 48 and 60 is used to connect two cables together. Joints may also be used to connect two or more smaller cables to a larger cable.

[0021] In each terminating line, the two wires of each pair are designated as the A wire and the B wire. At the local switch 10, in order to supply current to the line, a bias voltage of 50V is applied between the A wire and the B wire. As the bias voltage was applied in the early exchanges by using a battery, the bias voltage is still known as the battery voltage. In the terminal equipment, the A wire and B wire are connected by a capacitor, the presence of which may be detected when the terminal equipment is not in use.

[0022] The terminating lines in the access network 12 are prone to faults. The main causes of these faults are ingress of water and physical damage to the nodes through which the terminating lines pass between the local switch 10 and terminal equipment. There are five main faults which occur due to causes arising in the nodes. These faults are disconnection, short circuit, faulty battery voltage, earthing fault and low insulation resistance. A disconnection arises where a terminating line is interrupted between the local switch and the terminal equipment. A short circuit arises where the A wire and B wire of a line are connected together. A faulty battery voltage arises where the A wire or the B wire of a terminating line has a short circuit connection to the B wire of another line. An earthing fault arises when the A wire or B wire is connected to earth or to the A wire of another line. Low insulation resistance arises where the resistance between the A wire and the B wire, or between one of the wires and earth, or between one of the wires and a wire of another line, is below an acceptable value.

[0023] In order to detect faults in the terminating lines of the access network 12, the local switch 10 is provided with a line tester 80. The line tester 80 may be operated from the local switch 10 or, as will be explained in more detail below, from a remote location. The line tester 80 is capable of performing various tests, examples of which will be described below. Various models of line testers for local switches are available commercially. In the present example, the line tester 80 is either Teradyne or Vanderhoff test equipment. In some cases both types of test equipment may be used. As well as producing resistance, capacitance and voltage measurement data for a line, these pieces of equipment also produce further data called termination statements, such as “Bell Loop”, “Master Jack Loop” and “Bridged”. These termination statements are special line conditions which the equipment is arranged to detect.

[0024] Referring now to FIG. 2, there is shown the local switch 10 and the components of the communications network which provide a fault management system for the access network 12. These components comprise the line tester 80, a customer service system 100 for the communications network and an access network management system 102. The line tester 80 comprises a test head 104, which contains the electronic equipment for physically making line tests, and a controller 106 for the test head 104. The controller 106 takes the form of a computer. The controller 106 can be operated from a workstation 108 connected to it and provided at the local switch 10. The controller 106 is also connected to both the customer service system 100 and the access network management system 102 and can be operated by workstations connected to either the customer service system 100 or the access network management system 102.

[0025] The customer service system 100 is also a computer and it can be operated from any one of a number of workstations which are connected to it. In FIG. 2, one such workstation is shown and indicated by reference numeral 110. The customer service system 100 is used by operators of the communications network who have contact with the customers of the network. Together with these operators, the customer service system is responsible for providing various services to the customers.

[0026] The access network management system 102 is also a computer and it can be operated from one of a number of workstations. One of these workstations is shown in FIG. 2 and indicated by reference numeral 112. The access network management system 102 is responsible for managing the access network 12 as well as a number of other access networks in the same general geographical area as the access network 12. The access network management system 102 manages various operations for each of the access networks which it manages. These operations include the provision of new equipment, logging data on work performed by engineers in the network, maintaining data on the terminating lines and nodes of each access network, and the detection and management of faults. The workstations which are connected to the access network management system 102 are also connected to the customer service system 100. As shown in FIG. 2, the customer service system 100 and the access network management system 102 are connected together.

[0027] Although in the present example the fault management system for the access network 12 is formed from the line tester 80, the customer service system 100 and the access network management system 102, the fault management system could also be provided simply by the line tester 80 on its own. In order to achieve this, it would be necessary to add appropriate software to the computer which forms the controller 106. In a small network, this might be an appropriate form of providing the fault management system. However, in a large network it is advantageous to integrate the fault management system into the customer service system 100 and the access network management system 102.

[0028] The controller 106 is programmed to cause the test head 104 to make a series of routine tests each night on each terminating line of the access network 12. These tests will be explained with reference to the circuit diagram shown in FIG. 3.

[0029] In order to test a line, it may be disconnected from the switch 10 and connected to the test head 104. FIG. 3 shows a line 300 being tested. The line 300 has an A wire 302 and a B wire 304. The end of line 300 remote from switch 10 is connected to terminal equipment 306. Each of the wires 302, 304 has a resistance which depends upon its diameter and the distance from the local switch to the terminal equipment 306. Each of the wires 302, 304 is coated with an insulating material. The function of the insulating material is to provide insulation between each wire and adjacent wires. Damage to the insulating material or oxidation of the metal of a wire can cause the resistance between two adjacent wires to fall.

[0030] The effectiveness of the insulation between wires 302, 304 can be determined by measuring the resistance R1 between the A wire 302 and the B wire 304 and the resistance R2 between the B wire 304 and the A wire 302. The resistances R1 and R2 may be different because of rectification, as indicated by diodes D1 and D2. For a circuit in good condition, the resistances R1 and R2 are high, greater than 1 megohm. Damage to the insulating material or oxidation will cause the resistances R1, R2 to fall by an amount which depends upon the severity of the damage or oxidation. If the insulating material is totally destroyed so that the A and B wires are physically touching each other, the values of resistances R1, R2 will depend upon the distance between the test head 104 and the point of damage but will typically lie in the range 0 to 1500 ohms. Oxidation can result in wires effectively touching each other.

[0031] Only the A and B wires 302, 304 of the line 300 being tested are disconnected. In the other lines, the bias voltage of 50 volts is applied between the A wire and the B wire. In FIG. 3, the A wires of the other lines are collectively shown by a wire 310 which is connected at the switch 10 to earth. The B wires of the other lines are collectively shown by a wire 312 connected at the switch to a potential of −50 volts.

[0032] If the insulating material separating the A wire 302 or the B wire 304 from one of the adjacent A or B wires becomes damaged, or if one of the wires suffers oxidation, current may flow. The effectiveness of the insulation between the A and B wires 302, 304 and adjacent A and B wires can be determined by measuring the resistance R3 between the A wire 302 and adjacent A wires 310, the resistance R4 between the A wire 302 and adjacent B wires 312, the resistance R5 between the B wire 304 and adjacent A wires 310, and the resistance R6 between the B wire 304 and adjacent B wires 312.

[0033] For a good circuit, the resistances R3, R4, R5, R6 are high, greater than 1 megohm. Damage to insulating material may cause one or more of the resistances R3, R4, R5, R6 to fall by an amount which depends upon the severity of the damage. If the insulating material between the A wire 302 or the B wire 304 and an adjacent wire is totally destroyed so that the two wires are physically touching each other, the resistance between the two touching wires will depend upon the distance between the test head 104 and the point of damage but will typically lie in the range 0 to 1500 ohms. Oxidation can also result in two wires effectively touching each other.

[0034] The A and B wires 302, 304 and the insulating material between them act as a capacitor. In FIG. 3, the capacitance between the A and B wires is shown as having a value C1. The value of the capacitance between the A and B wires of a line will depend upon the length of the line. A break in the line 300 will reduce the value of capacitance C1 as measured from the test head 104. FIG. 3 also shows the capacitance C2 between the A wire 302 and earth and the capacitance C3 between the B wire 304 and earth.

[0035] Each night, the controller 106 causes the test head 104 to measure the resistances R1, R2, R3, R4, R5, R6 and the capacitances C1, C2, C3 for each terminating line of the access network 12. The controller 106 also causes the test head 104 to check if there is terminal equipment connected to the end of the line. Terminal equipment has a standard capacitance value. When terminal equipment is connected, the value of its capacitance is subtracted from the capacitance as measured by the test head to obtain the capacitance C1. For each terminating line, the results of the tests are stored against its directory number in the access network management system 102.
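By way of illustration only, the nightly record for one line and the capacitance correction described above might be represented as in the following sketch. The names `LineTestResult` and `corrected_c1`, the units, and the example capacitance values are all assumptions made for this example; the specification does not state the standard capacitance of terminal equipment.

```python
from dataclasses import dataclass


@dataclass
class LineTestResult:
    """One night's measurements for a terminating line (hypothetical layout)."""
    directory_number: str       # key under which the results are stored
    resistances_ohms: dict      # keys "R1" .. "R6"
    capacitances_farads: dict   # keys "C1" .. "C3"


def corrected_c1(measured_c1: float, terminal_present: bool,
                 terminal_capacitance: float) -> float:
    """Subtract the terminal equipment capacitance when equipment is detected.

    terminal_capacitance is supplied by the caller because the source does
    not give the standard value.
    """
    if terminal_present:
        return measured_c1 - terminal_capacitance
    return measured_c1
```

Passing the terminal capacitance as a parameter keeps the sketch independent of any particular equipment standard.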

[0036] The controller 106 transmits the results of the tests to the access network management system 102. The access network management system 102 examines the results of the series of tests for each terminating line for the presence of a suspected fault. The possible faults include disconnection, short circuit, a faulty battery voltage, an earth fault and low insulation resistance. When a fault is suspected, the name of the fault and the results of the test for the line are stored in the access network management system 102 against its directory number or an identifier in the exchange associated with the line. The details of the suspected faults found each night may be reviewed by an operator of the access network management system 102. Where appropriate, the operator may give instructions for a fault to be repaired.

[0037] The network management system 102 is also arranged to carry out some further processing of the data collected from the overnight testing. This further processing is designed to detect potential faults rather than actual faults so that, where appropriate, remedial work can be carried out before the fault is detected by a customer. An overview of the processing carried out by the network management system 102 will now be given with respect to FIG. 4 and a detailed example of the processing will also be given below. The processing is initiated at step 401, either automatically in response to the receipt of the appropriate data or by a human operator, and processing moves to step 403. At step 403, using known methods (which will be described in detail below), the test data for all the lines in question is analysed to identify lines with characteristics that indicate that a fault is likely to occur within a predetermined period of time, i.e. an anticipated hard fault (AHF). The parameters for determining this are line resistance measurements and the thresholds are derived from historical data.

[0038] At step 405, records of line configurations, i.e. the nodes in the network through which particular lines are connected, are used to establish the pattern of anticipated hard faults for each node. The pattern is then analysed to identify and count clusters of faults in step 407. Then, at step 409, the clusters for a given node are analysed to verify that the correct number of clusters have been identified and that the clusters are statistically significant. At step 411, the clusters of anticipated hard faults in a given node are used to calculate a cluster node score. This score can then be used to rank the node against other nodes through which the same set of lines pass so as to enable the identification of the most likely node from which the faults are emanating. In other words, the cluster score can be used to locate the cause of the anticipated faults.

[0039] At step 413, further analysis of the anticipated hard faults is carried out and a priority score calculated for a given node. This priority score provides an indication of how soon a node is expected to become faulty and is used to establish which one of a set of nodes that carry the same set of lines is in most urgent need of attention. It should be noted that the cluster score and the priority score can be used independently or in combination. In other words, in carrying out preventative maintenance on a given node, the indication of the node most likely to be the source of the anticipated hard faults can be used independently or in combination with the indication of the node which is likely to become most faulty soonest.

[0040] The invention will now be described further by way of a worked example showing test data from a set of lines being processed in the manner outlined above with reference to FIG. 4. FIG. 5 shows the test data for each of twenty-six lines running from an exchange. For each line the test data comprises four capacitance measurements: between the A wire and earth, between the A wire and the B wire (both a current measurement and a prior measurement) and between the B wire and earth. The data also comprises a distance measurement for each line and a series of resistance measurements between each combination of the A wire, B wire, Battery and Earth. These correspond to the capacitances C1, C2, C3 and resistances R1, R2, R3, R4, R5, R6 described above with reference to FIG. 3. In addition, there is a previous capacitance reading between the A and B wires and a termination flag (Term) supplied by the Vanderhoff and/or Teradyne equipment. However, for the purposes of the present invention, only the resistance measurements between the A wire, the B wire and the Battery, i.e. R4 and R6, are used.

[0041] From historical data a threshold limit is defined for the measurements R4 and R6 below which the line to which the measurements apply is treated as having an anticipated hard fault (AHF). An anticipated hard fault is defined as a line which is expected, on the basis of its R4 and R6 resistances, to become faulty (i.e. to develop a hard fault) within a predetermined period. In the present embodiment the predetermined period is one year and the limit for the resistance measurements is 400 kohms. This threshold may be determined by analysis of historical data for lines which have become faulty. Alternatively the threshold can be estimated and then adjusted while the system is in use.
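The threshold test of the present embodiment can be sketched as follows. This is a minimal illustration only: the function name and the choice of ohms as the unit are assumptions, as is the reading that a line is flagged when either R4 or R6 falls below the limit.

```python
# 400 kohm limit from the present embodiment; in practice it may be
# adjusted while the system is in use.
AHF_THRESHOLD_OHMS = 400_000


def is_anticipated_hard_fault(r4_ohms: float, r6_ohms: float,
                              threshold_ohms: float = AHF_THRESHOLD_OHMS) -> bool:
    """Return True when either battery-side resistance is below the limit.

    Assumption: one low reading (R4 or R6) is enough to flag the line.
    """
    return r4_ohms < threshold_ohms or r6_ohms < threshold_ohms
```

Keeping the limit as a parameter reflects the statement that the threshold may be estimated first and tuned later.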

[0042] As noted above, FIG. 5 shows the test data for the lines emanating from an exchange. It can be seen that lines 4, 5, 9 to 12, 16 to 18, 20, 21, 23 and 24 all have resistance measurements between the A or B wire and battery of less than 400 kohms and as a result are classified as anticipated hard faults. FIG. 6 is a schematic representation showing nine of the twenty-six lines 601 to 609 of FIG. 5 as they emanate from an exchange 610 to the exchange side of a primary connection point (PCP) cabinet 611, to the distribution side 612 of the PCP, to two distribution points (DPs) 613, 614 and on toward customer premises equipment (CPE) (not shown). Only nine of the twenty-six lines are shown in FIG. 6 for the sake of clarity.

[0043] Each of the connection points on the exchange 610, the PCP 611, 612 and the DPs 613, 614 is individually identified by a letter and number sequence as shown in FIG. 6. These connection identifiers enable the route that each line takes through the network nodes to the CPE to be recorded. Accordingly, each line 601-609 has a data record associated with it that is stored in the access network management system 102. The record for each line shows data such as the telephone number associated with the line and the connection identifiers for each line. For example, the connection identifiers for line two 602 in FIG. 6 would be A03, E07, D08 and DP10. These identifiers are also associated with a unique identification of the node in the network to which they apply so as to enable connection identifiers on two nodes of the same type, such as those on the two DPs 613, 614 shown in FIG. 6, to be told apart.
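One possible shape for such a line record, and for deriving the per-node pattern of anticipated hard faults from it, is sketched below. The class and function names, the node identifiers such as `"PCP-611-E"` and the telephone number are hypothetical; FIG. 6 is not reproduced here, so the assignment of line two to a particular DP is also only an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LineRecord:
    telephone_number: str
    # (unique node identifier, connection identifier) pairs, exchange first
    route: list
    has_ahf: bool = False


def ahf_pattern_by_node(records):
    """Map each node to {connection identifier: AHF flag} for its lines."""
    pattern = defaultdict(dict)
    for record in records:
        for node_id, connection_id in record.route:
            pattern[node_id][connection_id] = record.has_ahf
    return dict(pattern)


# Hypothetical record for line two, using the identifiers quoted in the text.
line_two = LineRecord(
    telephone_number="555-0102",
    route=[("EXCH-610", "A03"), ("PCP-611-E", "E07"),
           ("PCP-612-D", "D08"), ("DP-613", "DP10")],
    has_ahf=True,
)
pattern = ahf_pattern_by_node([line_two])
```

Pairing every connection identifier with a node identifier is what allows identically labelled connection points on different nodes to be distinguished.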

[0044] As will be understood by those skilled in the art, lines from an exchange to the CPE seldom follow an orderly path through the nodes of the network. In other words, a line will not be connected to point A01 in the PCP, then E01, D01 and DP01 but instead will take an effectively random route across the connection points. In some cases, lines are deliberately mixed up so as to reduce the problems of cross-talk between the cables, i.e. in an attempt to avoid two or more cables running along the same physical path. This mixing up is carried out, for example, in the connections between the E-side and the D-side of a PCP such as PCP 611, 612 in FIG. 6.

[0045] An anticipated hard fault (AHF) that is identified on a particular line may have occurred as a result of degradation of the line at any point along its length from the exchange to the CPE. Faults (including AHFs) very often occur at the points where the line is connected to a network node such as a PCP or DP. These are points at which the physical cable is more easily affected by corrosion, the breakdown of insulation or water ingress. In FIG. 6, the points at which the lines that are showing an AHF according to the test data of FIG. 5 are connected to a network node are indicated with large black dots (•). As noted above, not all the lines emanating from the exchange 610 are shown; instead, nine example lines are shown.

[0046] As noted above, the first step 403 in the processing carried out by the network management system 102 is to identify the lines that show AHFs and this is carried out by the analysis of the data shown in FIG. 5. This analysis reveals AHFs on lines 2 and 5 to 8 in the present example. In the next step 405, the processing analyses each node or each one of a selection of nodes from the network. This analysis will now be explained further with reference to an example of 26 cables from a frame of a network node (the node could be an exchange, a PCP or a DP). The frame is represented in table 1 below by a sequence of nominal connection identifiers 1 to 26:

TABLE 1
Position:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Status:    a  a  a  b  b  a  a  a  b  b  b  b  a  a  a  b  b  b  a  b  b  a  b  b  a  a

[0047] The second line of table 1 above indicates whether or not the line attached at the relevant connection point is exhibiting an AHF. An “a” designates a fault free line while a “b” designates a line exhibiting an AHF. The next step in the processing is to establish the number of clusters of AHFs that are present for the frame. Firstly, the range over which AHF clusters occur is established. In the example of table 1 above the clusters start at line 4 and extend to line 24. Therefore the range over which clusters occur is 4 to 24, and 13 of these lines are showing AHFs (i.e. are suspect).
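The first part of this step, finding the span of positions over which suspects occur and counting them, can be sketched as below. The string encoding of table 1 and the function name are assumptions made for illustration.

```python
# Second line of table 1, positions 1 to 26 ("a" = fault free, "b" = AHF).
TABLE_1_FLAGS = "aaabbaaabbbbaaabbbabbabbaa"


def suspect_span(flags: str):
    """Return (first position, last position, suspect count), 1-indexed.

    Returns None when no line on the frame is exhibiting an AHF.
    """
    positions = [i for i, flag in enumerate(flags, start=1) if flag == "b"]
    if not positions:
        return None
    return positions[0], positions[-1], len(positions)


print(suspect_span(TABLE_1_FLAGS))  # (4, 24, 13)
```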

[0048] The next step 407 in the processing determines whether any of the lines which are not shown as AHFs and which lie between groups of suspect lines are, in fact, misdiagnosed and should be treated as “b”s or AHFs. The basis for this element of the processing is that lines or cables that are situated in close proximity tend to share fault characteristics, since the cause of the fault in one line, for example water dripping down the frame of the network node, is not in practice isolated to that single line or cable. The Cluster Range, i.e. the number of lines or distance between two suspects (“b”s) that determines whether the two suspects are part of the same cluster or are separate clusters, is determined in accordance with the following formula:

$$\text{Cluster Range} = \left(\frac{\text{No. in Group}}{\text{No. of Suspects}}\right)^{p}$$

[0049] (where “p” is the range parameter, which in the present embodiment is set to 0.5)

[0050] The formula refers to a group, which is the subset of the data from table 1 running from the first line exhibiting an AHF to the last line to do so. In table 1 above, the group runs from position 4 to position 24 and so contains 21 lines. The formula takes the total number of lines in the group being analysed, divides it by the total number of suspects in the group and raises the result to the power of the range parameter p. Therefore, in the present example, the cluster range is calculated as (21/13)^(0.5) ≈ 1.27. The cluster range is then used to determine which of the apparently fault free lines (“a”s in table 1 above) that are physically located between lines that show AHFs should be treated as showing an AHF. In other words, if there is only one “a” between two (or more) “b”s then the “a” is treated as a “b” and as part of the cluster with its adjacent “b”s, since 1 < cluster range = 1.27. If there were two “a”s then these would not be treated as forming a cluster with the adjacent “b”s, since 2 > cluster range = 1.27. Applying the cluster range to the data shown in table 1 gives the results illustrated in table 2 below.

TABLE 2
Cluster   Position    Cluster ID    Cluster ID    Below Cluster
Type      on Frame    Number (B)    Number (A)    Range?
B         4-5         1
A         6-8                       2             N
B         9-12        3
A         13-15                     4             N
B         16-18       5
A         19          5                           Y
B         20-21       5
A         22          5                           Y
B         23-24       5
Total B: 3 clusters
Total A: 2 clusters

[0051] The result of applying the cluster range to the data from table 1 can be seen in the fifth column of table 2. This shows that the “a”s at positions 19 and 22 of table 1 have been treated as “b”s, resulting in the data from positions 16 to 24 being treated as a single cluster of AHFs. Conversely, the “a”s at positions 6 to 8 and 13 to 15 are treated as legitimately indicated as fault free, i.e. not part of their adjacent fault clusters.
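The gap-filling rule of step 405 can be sketched in Python. This is a minimal illustration rather than the implementation of the described system: the function names `cluster_range` and `merge_short_gaps`, and the encoding of line states as a string of “a” and “b” characters, are assumptions made for the example. The sequence is rebuilt from the position ranges listed in table 2. Evaluating the formula literally with 24, 13 and p = 0.5 gives about 1.36, whereas the 1.84 quoted in the text equals 24/13 before the exponent is applied; both values lie between 1 and 2, so the same gaps are merged.

```python
from itertools import groupby


def cluster_range(n_in_group, n_suspects, p=0.5):
    """Cluster Range = (No. in Group / No. of Suspects) ** p."""
    return (n_in_group / n_suspects) ** p


def merge_short_gaps(states, max_gap):
    """Relabel runs of 'a' shorter than max_gap that lie between two 'b' runs.

    `states` is assumed to contain only the characters 'a' and 'b'.
    """
    runs = [(key, len(list(grp))) for key, grp in groupby(states)]
    merged = []
    for i, (key, length) in enumerate(runs):
        # Runs alternate between 'a' and 'b', so an interior 'a' run
        # always has a 'b' run on each side.
        interior = 0 < i < len(runs) - 1
        if key == "a" and interior and length < max_gap:
            key = "b"
        merged.append(key * length)
    return "".join(merged)


# Positions 4 to 24, following table 2:
# B 4-5, A 6-8, B 9-12, A 13-15, B 16-18, A 19, B 20-21, A 22, B 23-24
group = "bb" + "aaa" + "bbbb" + "aaa" + "bbb" + "a" + "bb" + "a" + "bb"

limit = cluster_range(24, group.count("b"))
merged = merge_short_gaps(group, limit)
clusters = [key for key, _ in groupby(merged)]
```

With these inputs the single “a”s at positions 19 and 22 are relabelled, leaving five alternating clusters, which matches the cluster count used in the later steps.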

[0052] Accordingly, the information recovered from the analysis of the data of table 1 is as set out below in table 3.

TABLE 3

| Quantity | Calculation | Value | Symbol |
|---|---|---|---|
| Number of Suspects (AHF) | | 13 | NS |
| Number of Clusters (A & B) | | 5 | NC |
| Number in Group | (24 − 4) + 1 | 21 | |
| Number of Non-Suspects | 21 − 13 | 8 | NO |

[0053] The total number of lines identified as suspect is thirteen, and these make up a total of five clusters. The total number of lines in the group is 21, i.e. excluding from the data in table 1 the non-faulty lines at the beginning and end of the sequence. The total number of non-suspect lines within the group is eight. In determining the data in table 3 above, the lines at positions 19 and 22 are treated as “b”s for the cluster score calculation but as “a”s for the remaining calculations.
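The tallies in table 3 can be reproduced with a short Python sketch. The function name `summarise_group` and the string encoding of the lines are assumptions made for illustration. The point it shows is that the suspect count comes from the lines as tested, while the cluster count comes from the sequence after positions 19 and 22 have been relabelled.

```python
from itertools import groupby


def summarise_group(original, merged):
    """Return the table 3 quantities for one group of lines."""
    ns = original.count("b")              # suspects, counted before gap-filling
    nc = sum(1 for _ in groupby(merged))  # clusters (A and B), counted after gap-filling
    n_group = len(original)               # lines from first to last suspect
    return {"NS": ns, "NC": nc, "group": n_group, "NO": n_group - ns}


# Positions 4 to 24 as tested, and with positions 19 and 22 relabelled as 'b'
original = "bb" + "aaa" + "bbbb" + "aaa" + "bbb" + "a" + "bb" + "a" + "bb"
merged = "bb" + "aaa" + "bbbb" + "aaa" + "b" * 9

summary = summarise_group(original, merged)
```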

[0054] The next stage 409 in the processing is to determine whether the clustering that has been identified is coincidental or more likely to result from a single cause. Essentially the test is one of randomness. If the cluster pattern is random then it is treated as coincidental, while if it is not random it is treated as resulting from a single cause. This is determined by calculating a cluster value as follows:

$\text{Cluster Value} = \mathrm{ABS}\left( \frac{NC - \text{Mean}}{SD} \right)$

[0055] where NC is found in table 3 above, and the Mean and the standard deviation SD are given by the formulae set out below, in which NS and NO are likewise taken from table 3.

$\text{Mean} = \left( \frac{2 \times NS \times NO}{NS + NO} \right) + 1$

$SD = \sqrt{\frac{2 \times NS \times NO \left( 2 \times NS \times NO - NO - NS \right)}{\left( NO + NS \right)^{2} \times \left( NO + NS - 1 \right)}}$

[0056] These equations make up a test called the Mann Whitney U Test, which is a test for randomness. Taking the data recovered and shown in table 3 above, the following calculations are made by the processing in step 409:

$\text{Mean} = \left( \frac{2 \times 13 \times 8}{13 + 8} \right) + 1 = 10.904$

$SD = \sqrt{\frac{2 \times 13 \times 8 \left( 2 \times 13 \times 8 - 8 - 13 \right)}{\left( 8 + 13 \right)^{2} \times \left( 8 + 13 - 1 \right)}} = 2.099$

$\text{Cluster Value} = \mathrm{ABS}\left( \frac{5 - 10.904}{2.099} \right) = 2.853$

[0057] The cluster value is then compared to a threshold value called the cluster parameter. If the cluster value is above the threshold, the cluster in question is treated as a valid cluster. If the cluster value is below the threshold, it is not treated as a cluster. In the present embodiment, the cluster parameter is set at 1.96, which is the point at which there is a 95% chance of the pattern of AHFs being non-random according to a normal distribution. The cluster parameter can be adjusted while the system is in use. It can be seen that in the present example the cluster value of 2.853 is greater than the cluster parameter, thus indicating that the data from table 1 being analysed represents a true (i.e. non-random) cluster.
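The randomness test of step 409 can be sketched as follows. The function names `cluster_value` and `is_valid_cluster` are illustrative assumptions, and the sketch does not guard against a zero standard deviation. The Mean and SD expressions have the same form as those used in the Wald-Wolfowitz runs test. Evaluated in floating point with the table 3 figures, the sketch returns about 2.81 rather than the 2.853 quoted in the text; either figure exceeds the cluster parameter of 1.96.

```python
import math


def cluster_value(ns, no, nc):
    """Absolute standardised difference between the observed and expected cluster counts."""
    mean = (2 * ns * no) / (ns + no) + 1
    variance = (2 * ns * no * (2 * ns * no - no - ns)) / (
        (no + ns) ** 2 * (no + ns - 1)
    )
    return abs((nc - mean) / math.sqrt(variance))


def is_valid_cluster(value, cluster_parameter=1.96):
    """A cluster is treated as valid when its value is above the threshold."""
    return value > cluster_parameter


value = cluster_value(ns=13, no=8, nc=5)
valid = is_valid_cluster(value)
```

Keeping the cluster parameter as an argument reflects the statement that it can be adjusted while the system is in use.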

[0058] The next step 411 in the processing is to calculate the priority score for the node being analysed. This score takes into account a number of different factors of historical data relating to the node being analysed, as well as the cluster value established in the previous steps. The data used by this step in the analysis is, in the present embodiment, stored by the network management system 102 for each node and comprises the number of lines that are not being used, i.e. the number of spare pairs, the number of suspect lines (or pairs), the number of working lines, the number of faulty lines, and a previously retained percentage increase in faulty lines. The following formula is then used to calculate the priority score for the node.

$\text{PriorityScore} = \left( 1 - \left( \frac{S - \left( Sus + F \right)}{\mathrm{One}(S)} \right) \times \left( \frac{Sus \times 100}{\mathrm{One}(W)} \right) \right) + \left( I \times P1 \right) + \left( C \times P2 \right)$

[0059] where:

[0060] S=spare lines;

[0061] Sus=suspect lines;

[0062] W=working lines;

[0063] F=faulty lines;

[0064] C=cluster value;

[0065] I=percentage increase in faulty lines; and

[0066] I^(P)=previous percentage increase in faulty lines.

[0067] The percentage increase in faulty lines I is calculated in accordance with the following formula:

$I = \left( \frac{F}{F + S + W} \right) - I^{P} \quad \text{if } I < 1 \text{ then}$

[0068] There are two further factors, P1 and P2, which affect the priority score. These are weighting factors which can be used to adjust the performance of the priority algorithm. The first weighting factor P1 is termed the Fault Increase Weighting Factor and in this embodiment is set to a value of 100. I is a measure of the rate of fault increase and P1 governs the effect that I has on the priority score. The second weighting factor P2 is termed the Grouping Algorithm Weighting Factor and in this embodiment is set to a value of 10. P2 governs the effect that the cluster value C has on the priority score. The priority score algorithm also makes use of a function called “One” which converts values of “0” to “1”.

[0069] The calculation of the priority score will now be explained further with reference to an example of the E-side of a PCP that has 87 lines (or pairs) running in to it; 10 are spare lines, 13 are suspect (AHFs), 65 are working (i.e. not faulty or AHFs) and 12 are known to be faulty. The suspects in this example of 87 lines are clustered in the same pattern as shown in table 1 above. The cluster value calculation is independent of the number of lines and instead only takes into account the lines in the suspect group. As a result, the cluster value for the present example of 87 lines will be the same as that calculated above with reference to the data of table 1, i.e. 2.853. In this example the previous percentage increase in faulty pairs is 12.6%.

[0070] Accordingly, in step 411, I is calculated as follows:

$I = \left( \frac{12}{12 + 10 + 65} \right) - 0.126 = 0.137 - 0.126 = 0.011$
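The fault-increase term can be checked with a few lines of Python. The function name `fault_increase` and the convention of passing the previous increase as a fraction (12.6% as 0.126) are assumptions for illustration. The trailing condition in the formula for I is incomplete in the source, so it is not modelled here. Without intermediate rounding the result is about 0.0119; the text truncates 12/87 to 0.137 and therefore reports 0.011.

```python
def fault_increase(faulty, spare, working, previous):
    """I = F / (F + S + W) - I_P, with `previous` expressed as a fraction."""
    return faulty / (faulty + spare + working) - previous


increase = fault_increase(faulty=12, spare=10, working=65, previous=0.126)
```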

[0071] Thus the priority score for the node is calculated as follows:

$\begin{aligned} \text{PriorityScore} &= \left( 1 - \left( \frac{10 - \left( 13 + 12 \right)}{\mathrm{One}(10)} \right) \times \left( \frac{13 \times 100}{\mathrm{One}(65)} \right) \right) + \left( 0.011 \times 100 \right) + \left( 2.853 \times 10 \right) \\ &= \left( 1 - \left( -1950/650 \right) \right) + 1.10 + 28.53 \\ &= 4.00 + 1.10 + 28.53 \\ &= 33.63 \end{aligned}$
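A minimal Python sketch of the priority score follows. The names `one` and `priority_score` and the keyword arguments are illustrative assumptions, and the second fraction follows the worked example in multiplying the suspect count by 100. Evaluated literally with the inputs above, the first bracket comes to 1 − (−1.5 × 20) = 31 rather than the 4.00 shown in the worked figures, so the sketch returns about 60.63 where the text gives 33.63; the two weighting terms, 1.10 and 28.53, are the same in both.

```python
def one(value):
    """The 'One' function: convert a value of 0 to 1 so it can serve as a divisor."""
    return 1 if value == 0 else value


def priority_score(spare, suspect, working, faulty, increase, cluster,
                   p1=100, p2=10):
    """Priority score for a node; p1 and p2 are the weighting factors P1 and P2."""
    spare_term = (spare - (suspect + faulty)) / one(spare)
    suspect_term = (suspect * 100) / one(working)
    return (1 - spare_term * suspect_term) + increase * p1 + cluster * p2


score = priority_score(spare=10, suspect=13, working=65, faulty=12,
                       increase=0.011, cluster=2.853)
```

Because `one` only replaces zero, a node with no spare or no working lines still produces a finite score instead of a division error.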

[0072] As noted above, the priority score is calculated for a number of nodes in the network and can then be used to determine how work such as preventative maintenance should be prioritised. The higher the priority score, the more urgent the maintenance. FIG. 7 shows the same set of network nodes as are described above with reference to FIG. 6 but with the addition of the priority scores and cluster scores for each node. Again, only nine of the 87 lines running from the exchange are illustrated for the sake of clarity.

[0073] The node with the highest cluster score and the highest priority score is the E-side of the PCP 611. This indicates to the network manager that, because there is a cluster of faults in that node, it is likely to be the source of the anticipated faults that have been detected on the lines that run through the set of nodes that have been analysed. Often, as mentioned above, such clustered AHFs are caused by the same problem, such as water leaking in to the cabinet that holds the network node and causing corrosion and/or short circuits. The priority score gives the network manager a further indication of how the maintenance of the network of FIG. 7 should be planned, as it gives a relative measure of the urgency of the preventative maintenance for a given node. In other words, it gives an indication of how soon hard faults are going to appear and how many.

[0074] In the example shown in FIG. 7, the highest priority score and the highest cluster score both occur for the same node. Although this will not be an unusual situation in practice, situations are also possible where the highest of each of the scores occur for different nodes. In this case, the judgement of the network manager would be needed to decide between carrying out maintenance on the highest priority node or the node with the greatest cluster score (or perhaps both). It will also be clear to those skilled in the art that the cluster score system and the priority score system can be used either together, as noted above, or independently of each other. Furthermore, although the cluster value is used in the calculation of the priority score for a node, it will be clear to those skilled in the art that this is not essential and that a priority score, for use in the same manner as described above, can still be calculated without taking into account a cluster value.
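The case where the two scores point at different nodes could be surfaced to the network manager along the following lines. This is a sketch only: the `NodeScore` type, the `top_nodes` function, the node names and the score values are all hypothetical and are not taken from FIG. 7.

```python
from typing import NamedTuple


class NodeScore(NamedTuple):
    node: str
    cluster: float
    priority: float


def top_nodes(scores):
    """Return (highest-priority node, highest-cluster node, whether they coincide)."""
    by_priority = max(scores, key=lambda s: s.priority)
    by_cluster = max(scores, key=lambda s: s.cluster)
    return by_priority.node, by_cluster.node, by_priority.node == by_cluster.node


# Hypothetical scores for three nodes
scores = [
    NodeScore("node A", cluster=2.1, priority=18.0),
    NodeScore("node B", cluster=0.9, priority=25.0),
    NodeScore("node C", cluster=1.2, priority=7.5),
]

result = top_nodes(scores)
```

When the third element is false, the choice between the two nodes is left to the network manager, as described above.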

[0075] The results of the processing of the data of table 1 to produce the cluster and priority scores for each node in the network can be presented to the user of the network management system 102 in a number of ways. For example, the results can be presented in tabular form with columns showing the scores for each node. Alternatively, the results can be displayed pictorially as shown in FIG. 7, with the scores being presented in boxes near a representation of the network node to which they relate. This can be supplemented by indications such as the black dots (•) where lines exhibiting AHFs are attached to the network nodes, so as to give a visual indication of the clustering in addition to the cluster score.

[0076] Although the present invention has been described with reference to an access network in which each circuit is carried by a piece of copper wire, it may also be used for terminating circuits carried by optical fibres.

[0077] It will be understood by those skilled in the art that the apparatus that embodies the invention could be a general purpose computer having software arranged to provide the analysis and/or processing of the test data. The computer could be a single computer or a group of computers and the software could be a single program or a set of programs. Furthermore, any or all of the software used to implement the invention can be contained on various transmission and/or storage mediums such as a floppy disc, CD-ROM, or magnetic tape so that the program can be loaded onto one or more general purpose computers, or could be downloaded over a computer network using a suitable transmission medium.

[0078] Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise”, “comprising” and the like are to be construed in an inclusive as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to”.

1. A method of operating a fault management system for a communications network comprising a plurality of lines passing through a plurality of nodes, said method comprising the steps of: performing a test on a plurality of said lines to obtain one or more elements of test data for each line; analysing the test data to identify lines with common fault characteristics; establishing a score for each node based on a relative measure of the physical clustering of lines with common fault characteristics for each node so as to give an indication of the node at which the cause of the common fault characteristic is most likely to be located.
2. A method according to claim 1 in which the common fault characteristics are those of faults that are expected to occur within a predetermined time period.
3. A method according to claim 1 or claim 2 in which the tested characteristic is a resistance measurement between one of the wires of the line and the battery.
4. A method according to any preceding claim in which each cluster at each node is analysed to determine if it is statistically significant or random.
5. A method according to any preceding claim in which groups of one or more lines which do not display the fault characteristics and that are disposed between clusters of lines that do display the fault characteristics are analysed to determine whether or not they form part of an adjacent cluster.
6. A fault management system for a communications network, said network comprising a plurality of lines passing through a plurality of nodes, said apparatus comprising: means operable to perform a test on a plurality of said lines to obtain one or more elements of test data for each line; means operable to analyse the test data to identify lines with common fault characteristics; means operable to establish a score for each node based on a relative measure of the physical clustering of lines with common fault characteristics for each node so as to give an indication of the node at which the cause of the common fault characteristic is most likely to be located.
7. Apparatus according to claim 6 in which the common fault characteristics are those of faults that are expected to occur within a predetermined time period.
8. Apparatus according to claim 6 or claim 7 in which the tested characteristic is a resistance measurement between one of the wires of the line and the battery.
9. Apparatus according to any of claims 6 to 8 in which each cluster at each node is analysed to determine if it is statistically significant or random.
10. Apparatus according to any of claims 6 to 9 in which groups of one or more lines which do not display the fault characteristics and that are disposed between clusters of lines that do display the fault characteristics are analysed to determine whether or not they form part of an adjacent cluster.
11. A computer program or set of computer programs arranged to cause a general purpose computer or group of such computers to carry out the method of claims 1 to 5 or to provide the apparatus of claims 6 to 10.